8 research outputs found

    Review of the techniques used in motor‐cognitive human‐robot skill transfer

    Abstract: Conventional robot programming methods severely limit the reusability of skills during development: engineers programme a robot in a targeted manner to realise predefined skills, and the resulting low reusability of general-purpose robot skills shows up chiefly as an inability to cope with novel and complex scenarios. Skill transfer aims to transfer human skills to general-purpose manipulators or mobile robots so that they can replicate human-like behaviours. Commonly used skill transfer methods, such as learning from demonstration (LfD) or imitation learning, endow the robot with the expert's low-level motor ability and high-level decision-making ability, so that skills can be reproduced and generalised according to the perceived context. Improving robot cognition usually amounts to improving this autonomous high-level decision-making ability: given a generic or specialised robot skill library, robots are expected to reason autonomously about which skills are needed and to plan compound movements from sensory input. Many successful studies in this area in recent years have demonstrated the effectiveness of these ideas. Herein, a detailed review is provided of skill transfer techniques, their applications, advancements, and limitations, with particular attention to LfD. Future research directions are also suggested.
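    The core LfD loop the review surveys (record expert demonstrations, fit a policy, reproduce the skill in new contexts) can be illustrated with a minimal behavioural-cloning sketch. Everything below, including the synthetic data, the linear policy, and the ridge regulariser, is an illustrative assumption rather than the method of any particular surveyed paper.

```python
import numpy as np

# Hypothetical demonstration data: states (e.g. joint angles) paired with
# expert actions (e.g. joint velocities). Shapes are illustrative only.
rng = np.random.default_rng(0)
states = rng.normal(size=(500, 6))   # 500 demonstrated states
actions = states @ rng.normal(size=(6, 2)) + 0.01 * rng.normal(size=(500, 2))

# Behavioural cloning with a linear policy: fit W to minimise
# ||states @ W - actions||^2 plus a small ridge penalty.
lam = 1e-3
W = np.linalg.solve(states.T @ states + lam * np.eye(6), states.T @ actions)

def policy(state):
    """Reproduce the demonstrated skill for a newly perceived state."""
    return state @ W

print(policy(rng.normal(size=6)))
```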

    Inverse reinforcement learning from failure

    Inverse reinforcement learning (IRL) allows autonomous agents to learn to solve complex tasks from successful demonstrations. However, in many settings, e.g., when a human learns the task by trial and error, failed demonstrations are also readily available. In addition, in some tasks, purposely generating failed demonstrations may be easier than generating successful ones. Since existing IRL methods cannot make use of failed demonstrations, in this paper we propose inverse reinforcement learning from failure (IRLF), which exploits both successful and failed demonstrations. Starting from the state-of-the-art maximum causal entropy IRL method, we propose a new constrained optimisation formulation that accommodates both types of demonstrations while remaining convex. We then derive update rules for learning reward functions and policies. Experiments on both simulated and real-robot data demonstrate that IRLF converges faster and generalises better than maximum causal entropy IRL, especially when few successful demonstrations are available.
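    As a rough illustration of the idea (not the paper's exact constrained formulation or update rules), the sketch below runs a maximum-entropy-style gradient update on reward weights, pulling the policy's feature expectations toward those of successful demonstrations and away from those of failed ones. The feature data and the policy-feature stub are hypothetical placeholders.

```python
import numpy as np

rng = np.random.default_rng(1)
n_features = 4

# Hypothetical per-trajectory feature counts (one row per trajectory).
phi_success = rng.normal(size=(20, n_features))   # successful demos
phi_failure = rng.normal(size=(20, n_features))   # failed demos

def policy_feature_expectation(w):
    """Stand-in for the soft-optimal policy's expected features under
    reward r(s, a) = w . phi(s, a); a real implementation would obtain
    this via soft value iteration in the task MDP."""
    return 0.5 * np.tanh(w)  # toy placeholder

w = np.zeros(n_features)
alpha, beta = 0.1, 0.05      # step size, failure-dissimilarity weight
for _ in range(200):
    # Maximum-entropy-style gradient: match successful demo features...
    grad = phi_success.mean(axis=0) - policy_feature_expectation(w)
    # ...while pushing the policy's features away from failed demos
    # (a simplification of IRLF's constrained optimisation).
    grad += beta * (policy_feature_expectation(w) - phi_failure.mean(axis=0))
    w += alpha * grad

print("learned reward weights:", w)
```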

    Rapidly exploring learning trees

    Inverse Reinforcement Learning (IRL) for path planning enables robots to learn cost functions for difficult tasks from demonstration, instead of hard-coding them. However, IRL methods face practical limitations that stem from the need to repeat costly planning procedures. In this paper, we propose Rapidly Exploring Learning Trees (RLT∗), which learns the cost functions of Optimal Rapidly Exploring Random Trees (RRT∗) from demonstration, thereby making inverse learning methods applicable to more complex tasks. Our approach extends Maximum Margin Planning to work with RRT∗ cost functions. Furthermore, we propose a caching scheme that greatly reduces the computational cost of this approach. Experimental results on simulated and real-robot data from a social navigation scenario show that RLT∗ achieves better performance at lower computational cost than existing methods. We also successfully deploy control policies learned with RLT∗ on a real telepresence robot.
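    A hedged sketch of the Maximum Margin Planning-style update that RLT∗ builds on: a subgradient step adjusts the cost weights until the demonstrated path is cheaper under the learned cost than the planner's best path. The RRT∗ planner (and the caching scheme the paper describes) is abstracted behind a stub, and all data here are synthetic.

```python
import numpy as np

rng = np.random.default_rng(2)
n_features = 5

# Feature counts of a demonstrated path (hypothetical).
phi_demo = rng.uniform(size=n_features)

def plan_with_rrt_star(w):
    """Stub for an RRT*-style planner returning the feature counts of
    the best path under cost c(path) = w . phi(path). RLT*'s caching
    scheme would reuse search trees across iterations here."""
    candidates = rng.uniform(size=(50, n_features))  # cached candidate paths
    costs = candidates @ w
    return candidates[np.argmin(costs)]

# Maximum Margin Planning-style subgradient descent on the cost weights:
# raise the cost of features the planner's best path over-uses relative
# to the demonstration.
w = np.ones(n_features)
alpha = 0.05
for _ in range(100):
    phi_plan = plan_with_rrt_star(w)
    w += alpha * (phi_plan - phi_demo)
    w = np.maximum(w, 1e-6)  # keep costs positive so planner costs stay valid

print("learned cost weights:", w)
```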

    VariBAD: a very good method for Bayes-adaptive deep RL via meta-learning

    Trading off exploration and exploitation in an unknown environment is key to maximising expected return during learning. A Bayes-optimal policy, which does so optimally, conditions its actions not only on the environment state but also on the agent's uncertainty about the environment. Computing a Bayes-optimal policy is, however, intractable for all but the smallest tasks. In this paper, we introduce variational Bayes-Adaptive Deep RL (variBAD), a way to meta-learn to perform approximate inference in an unknown environment and to incorporate task uncertainty directly during action selection. In a grid-world domain, we illustrate how variBAD performs structured online exploration as a function of task uncertainty. We further evaluate variBAD on MuJoCo domains widely used in meta-RL and show that it achieves higher online return than existing methods.
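    A minimal sketch of the architecture described above, assuming PyTorch and illustrative dimensions: a recurrent inference network maps the trajectory so far to a Gaussian posterior over a latent task variable, and the policy conditions on the state together with the belief parameters. The training objective (an ELBO over rewards and transitions coupled with a standard RL loss) is omitted, and all module and dimension names are hypothetical.

```python
import torch
import torch.nn as nn

class TrajectoryEncoder(nn.Module):
    """Recurrent inference network: maps the history of (s, a, r) tuples
    to the parameters of a Gaussian posterior over a latent task
    variable, approximating the belief over the unknown task."""
    def __init__(self, input_dim, hidden_dim=64, latent_dim=5):
        super().__init__()
        self.gru = nn.GRU(input_dim, hidden_dim, batch_first=True)
        self.mu = nn.Linear(hidden_dim, latent_dim)
        self.logvar = nn.Linear(hidden_dim, latent_dim)

    def forward(self, history):          # history: (batch, T, input_dim)
        h, _ = self.gru(history)
        h_t = h[:, -1]                   # belief after t steps
        return self.mu(h_t), self.logvar(h_t)

class BeliefConditionedPolicy(nn.Module):
    """Policy conditioned on the state and the belief parameters, so
    action selection can depend on task uncertainty."""
    def __init__(self, state_dim, latent_dim, action_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + 2 * latent_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, action_dim))

    def forward(self, state, mu, logvar):
        return self.net(torch.cat([state, mu, logvar], dim=-1))

# Illustrative shapes: a length-10 history of 8-dim (s, a, r) features.
enc = TrajectoryEncoder(input_dim=8)
pi = BeliefConditionedPolicy(state_dim=5, latent_dim=5, action_dim=2)
mu, logvar = enc(torch.zeros(1, 10, 8))
action = pi(torch.zeros(1, 5), mu, logvar)
```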

    TERESA: A Socially Intelligent Semi-autonomous Telepresence System

    TERESA is a socially intelligent semi-autonomous telepresence system that is currently being developed as part of an FP7-STREP project funded by the European Union. The ultimate goal of the project is to deploy this system in an elderly day centre to allow elderly people to participate in social events even when they are unable to travel to the centre. In this paper, we present an overview of our progress on TERESA. We discuss the most significant scientific and technical challenges, including: understanding and automatically recognizing social behaviour; defining social norms for the interaction between a telepresence robot and its users; navigating the environment while taking into account social features and constraints; and learning to estimate the social impact of the robot's actions from multiple sources of feedback. We report on our current progress on each of these challenges, as well as our plans for future work.

    Transfer Learning for Multiagent Reinforcement Learning Systems
